Abstract

We present a new approach to formal language theory using Kol- 
mogorov complexity. The main results presented here are an alternative 
for pumping lemma(s), a new characterization for regular languages, and 
a new method to separate deterministic context-free languages and nonde- 
terministic context-free languages. The use of the new 'incompressibility 
arguments' is illustrated by many examples. The approach is also suc- 
cessful at the high end of the Chomsky hierarchy since one can quantify 
nonrecursiveness in terms of Kolmogorov complexity. (This is a prelimi- 
nary uncorrected version. The final version is the one published in SIAM 
J. Comput, 24:2(1995), 398-410.) 

1 Introduction 

It is feasible to reconstruct parts of formal language theory using algorithmic 
information theory (Kolmogorov complexity). We provide theorems on how to 
use Kolmogorov complexity as a concrete and powerful tool. We do not just want 
to introduce fancy mathematics; our goal is to help our readers do a large part
of formal language theory in the most essential, usually easiest, sometimes even 
obvious ways. In this paper it is only important to us to demonstrate that the 
application of Kolmogorov complexity in the targeted area is not restricted to 
trivialities. The proofs of the theorems in this paper may not be easy. However, 
the theorems are of the type that are used as a tool. Once derived, our theorems 
are easy to apply. 

1.1 Prelude 

The first application of Kolmogorov complexity in the theory of computation 
was in ]T9| , |2C| |. By re-doing proofs of known results, it was shown that static, 
descriptional (program size) complexity of a single random string can be used 
to obtain lower bounds on dynamic, computational (running time) complexity. 
None of the inventors of Kolmogorov complexity originally had these applica- 
tions in mind. Recently, Kolmogorov complexity has been applied extensively 
to solve classic open problems of sometimes two decades standing, ||, |l^ . 

For more examples see the textbook 

The secret of Kolmogorov complexity's success in dynamic, computational 
lower bound proofs rests on a simple fact: the overwhelming majority of strings 
has hardly any computable regularities. We call such a string 'Kolmogorov ran- 
dom' or 'incompressible'. A Kolmogorov random string cannot be (effectively) 
compressed. Incompressibility is a noneffective property: no individual string, 
except finitely many, can be proved incompressible. 

Recall that a traditional lower bound proof by counting usually involves all 
inputs of certain length. One shows that a certain lower bound has to hold 
for some 'typical' input. Since an individual typical input is hard (sometimes 
impossible) to find, the proof has to involve all the inputs. Now we understand 
that a typical input of each length can be constructed via an incompressible 
string. However, only finitely many individual strings can be effectively proved 
to be incompressible. No wonder the old counting arguments had to involve 
all inputs. In a proof using the new 'incompressibility method', one uses an 
individual incompressible string that is known to exist even though it cannot 
be constructed. Then one shows that if the assumed lower time bound would 
not hold, then this string could be compressed, and hence it would not be 
incompressible. 

1.2 Outline of the Paper 

The incompressibility argument above also works for formal languages and au- 
tomata theory proper. Assume the basic notions treated in a textbook like 

The first result is a powerful alternative to pumping lemmas for regular 
languages. It is well known that not all nonregular languages can be shown to be 
nonregular by the usual uvw-pumping lemma. There is a plethora of pumping
lemmas to show nonregularity, like the 'marked pumping lemma', and so on. 
In fact, it seems that many example nonregular languages require their own 
special purpose pumping lemmas. Comparatively recently, ^, exhaustive 

pumping lemmas that characterize the regular languages have been obtained. 

These pumping lemmas are complicated and complicated to use. The last 
reference uses Ramsey theory. In contrast, using Kolmogorov complexity we 
give a new characterization of the regular languages that simply makes our 
intuition of 'finite state'ness of these languages rigorous and is easy to apply. 
Being a characterization it works for all non-regular languages. We give several 
examples of its application, some of which were quite difficult using pumping
lemmas. 

To prove that a certain context-free language (cfl) is not deterministic context- 
free (dcfl) has required laborious ad hoc proofs, [0, or cumbersome-to-state and 
also difficult-to-apply pumping lemmas or iteration theorems [||, . We give 
necessary (Kolmogorov complexity) conditions for dcfl, that are very easy to
apply. We test the new method on several examples in cfl − dcfl, which were hard
to handle before. In certain respects the KC-DCFL Lemma may be more powerful
than the related lemmas and theorems mentioned above. On the high end
of the Chomsky hierarchy we present, for completeness, a known characterization
of recursive languages, and a necessary condition for recursively enumerable
languages.

2 Kolmogorov Complexity 

From now on, let $x$ denote both the natural number and the $x$th binary string
in the sequence 0, 1, 00, 01, 10, 11, 000, ... That is, the representation '3'
corresponds both to the natural number 3 and to the binary string 00. This way
we obtain a natural bijection between the nonnegative integers $\mathcal{N}$ and
the finite binary strings $\{0,1\}^*$. Numerically, the binary string
$x_{n-1} \ldots x_0$ corresponds to the integer

$$2^n - 1 + \sum_{i=0}^{n-1} x_i 2^i. \qquad (1)$$
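The correspondence in Equation (1) can be checked mechanically. The following is a minimal sketch in Python; the function names `nat_to_str` and `str_to_nat` are ours and do not appear in the paper. It relies on the observation that $x + 1$, written in ordinary binary, is a leading 1 followed by the string associated with $x$.

```python
def nat_to_str(x: int) -> str:
    """Binary string associated with the natural number x (0 maps to the empty string)."""
    if x < 0:
        raise ValueError("x must be a nonnegative integer")
    # bin(x + 1) looks like '0b1....'; drop '0b' and the leading 1.
    return bin(x + 1)[3:]


def str_to_nat(s: str) -> int:
    """Inverse map: 2^n - 1 + sum of x_i 2^i for a string of length n."""
    return int("1" + s, 2) - 1


if __name__ == "__main__":
    for x in range(1, 8):
        print(x, repr(nat_to_str(x)))
```

With this convention the string '00' is recovered from the number 3, as in the text, and the length of `nat_to_str(x)` equals $\lfloor \log(x+1) \rfloor$.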

We use the notation $l(x)$ to denote the length (number of bits) of a binary
string $x$. If $x$ is not a finite binary string but another finite object like a
finite automaton, a recursive function, or a natural number, then we use $l(x)$
to denote the length of its standard binary description. Let
$\langle \cdot,\cdot \rangle : \mathcal{N} \times \mathcal{N} \to \mathcal{N}$ be
a standard recursive, invertible, one-one encoding of pairs of natural numbers
in natural numbers. This idea can be iterated to obtain a pairing from triples
of natural numbers with natural numbers,
$\langle x,y,z \rangle = \langle x, \langle y,z \rangle \rangle$, and so on.

Any of the usual definitions of Kolmogorov complexity in |l^ will do 

for the sequel. We are interested in the shortest effective description of a
finite object $x$. To fix thoughts, consider the problem of describing a string
$x$ over 0's and 1's. Let $T_1, T_2, \ldots$ be the standard enumeration of
Turing machines. Since $T_i$ computes a partial recursive function
$\phi_i : \mathcal{N} \to \mathcal{N}$ we obtain the standard enumeration
$\phi_1, \phi_2, \ldots$ of partial recursive functions. We denote
$\phi(\langle x,y \rangle)$ as $\phi(x,y)$. Any partial recursive function $\phi$
from strings over 0's and 1's to such strings, together with a string $p$, the
program for $\phi$ to compute $x$, such that $\phi(p) = x$, is a description of
$x$. It is useful to generalize this idea to the conditional version:
$\phi(p, y) = x$ such that $p$ is a program for $\phi$ to compute $x$, given a
binary string $y$ for free. Then the descriptional complexity $C_\phi$ of $x$,
relative to $\phi$ and $y$, is defined by

$$C_\phi(x|y) = \min\{l(p) : p \in \{0,1\}^*, \ \phi(p, y) = x\},$$

or $\infty$ if no such $p$ exists.
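To make the definition of $C_\phi(x|y)$ concrete, here is a small sketch for one fixed, very weak description method. The function `phi` below is a toy chosen by us for illustration (it is total and far from universal), and the search bound `max_len` is likewise an assumption; the true Kolmogorov complexity, defined next via a universal function, is not computable by such a search.

```python
import math
from itertools import product


def phi(p: str, y: str):
    """Toy description method (an illustrative assumption, not universal).

    '0' + s        -> output s literally
    '1' + binary k -> output y repeated k times (k = 0 if no digits follow)
    empty program  -> undefined (None)
    """
    if not p:
        return None
    if p[0] == "0":
        return p[1:]
    k = int(p[1:], 2) if len(p) > 1 else 0
    return y * k


def c_phi(x: str, y: str, max_len: int = 12):
    """min{ l(p) : phi(p, y) = x }, searched over programs of length <= max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if phi("".join(bits), y) == x:
                return n
    return math.inf


if __name__ == "__main__":
    print(c_phi("0101", "01"))  # uses the conditional information y
    print(c_phi("0101", ""))    # must spell x out literally
```

Relative to this `phi`, the string 0101 has a three-bit description when $y = 01$ is given for free, but needs five bits when $y$ is empty.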

For a universal partial recursive function $\phi_0$, computed by the universal
Turing machine $U$, we know that, for each partial recursive function $\phi$,
there is a constant $i$ such that for all strings $x, y$, we have
$\phi_0(i, x, y) = \phi(x, y)$. Hence,
$C_{\phi_0}(x|y) \le C_\phi(x|y) + c_\phi$, where the constant $c_\phi$ depends
only on $\phi$. We fix a reference universal function $\phi_0$ and define
the conditional Kolmogorov complexity of $x$ given $y$ as
$C(x|y) = C_{\phi_0}(x|y)$.^1

The unconditional Kolmogorov complexity of $x$ is $C(x) = C(x|\epsilon)$, where
$\epsilon$ denotes the empty string ($l(\epsilon) = 0$).

Since there is a Turing machine that just copies its input to its output
we have $C(x|y) \le l(x) + O(1)$, for each $x$ and $y$. Since there are $2^n$
binary strings of length $n$, but only $2^n - 1$ possible shorter descriptions
$d$, it follows that $C(x) \ge l(x)$ for some binary string $x$ of each length.
We call such strings incompressible or random. It also follows that, for any
length $n$ and any binary string $y$, there is a binary string $x$ of length $n$
such that $C(x|y) \ge l(x)$. Considering $C$ as an integer function, using the
obvious one-one correspondence between finite binary words and nonnegative
integers, it can be shown that $C(x) \to \infty$ for $x \to \infty$. Finally,
$C(x,y)$ denotes $C(\langle x,y \rangle)$.
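The counting step above is elementary and can be verified directly. The sketch below (function names are ours) counts the binary descriptions shorter than $n$ bits and derives the standard consequence that fewer than a $2^{-c}$ fraction of the strings of length $n$ can be compressed by more than $c$ bits.

```python
def descriptions_shorter_than(n: int) -> int:
    """Number of binary strings of length 0, 1, ..., n - 1."""
    return sum(2 ** k for k in range(n))


def max_fraction_compressible(n: int, c: int) -> float:
    """Upper bound on the fraction of length-n strings x with C(x) < n - c.

    Each such x needs its own description of length < n - c, and there are
    only 2^(n-c) - 1 of those. Assumes 0 <= c <= n.
    """
    return descriptions_shorter_than(n - c) / 2 ** n


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, 2 ** n, descriptions_shorter_than(n))
    print(max_fraction_compressible(20, 5))
```

Since $2^n - 1 < 2^n$, at least one string of each length $n$ has no description shorter than itself, whatever the description method.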

Example 1 (Self-Delimiting Strings) A prefix code is a mapping from finite
binary code words to source words, such that no code word is a proper
prefix of any other code word. We define a particular prefix code.

For each binary source word $x$ define the code word $\bar{x}$ by

$$\bar{x} = 1^{l(x)} 0 x.$$

Define

$$x' = \overline{l(x)}\, x.$$

The string $x'$ is called the self-delimiting code of $x$.

Set $x = 01011$. Then $l(x) = 5$, which corresponds to binary string '10', and
$\overline{l(x)} = 11010$. Therefore, $x' = 1101001011$ is the self-delimiting
code of '01011'.

1 Similarly, we define the complexity of the $x$th partial recursive function
$\phi$ conditional to the $y$th partial recursive function $\psi$ by
$C(\phi|\psi) = C(x|y)$.
The self-delimiting code of a positive integer $x$ requires
$l(x) + 2\log l(x) + 1$ bits. It is easy to verify that
$l(x) = \lfloor \log(x + 1) \rfloor$. All logarithms are base 2
unless otherwise noted. For convenience, we simply denote the length $l(x)$ of a
natural number $x$ by '$\log x$'. ◇
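The self-delimiting code of Example 1 is easy to implement and to decode from the front of a longer bit stream. The following sketch uses helper names of our own choosing (`bar`, `self_delimiting`, `decode_prefix`) and the number-to-string correspondence of Equation (1).

```python
from typing import Tuple


def nat_to_str(x: int) -> str:
    """Binary string associated with the natural number x (Equation (1))."""
    return bin(x + 1)[3:]


def str_to_nat(s: str) -> int:
    return int("1" + s, 2) - 1


def bar(x: str) -> str:
    """The code word 1^{l(x)} 0 x."""
    return "1" * len(x) + "0" + x


def self_delimiting(x: str) -> str:
    """The code x' = bar(l(x)) x, with l(x) written as a binary string."""
    return bar(nat_to_str(len(x))) + x


def decode_prefix(stream: str) -> Tuple[str, str]:
    """Split a stream that starts with some x' into (x, remainder)."""
    k = stream.index("0")                 # number of leading 1's = l(l(x))
    length = str_to_nat(stream[k + 1:2 * k + 1])
    start = 2 * k + 1
    return stream[start:start + length], stream[start + length:]


if __name__ == "__main__":
    code = self_delimiting("01011")
    print(code)
    print(decode_prefix(code + "111"))
```

Because the decoder knows where $x'$ ends without any end marker, several items can be concatenated and separated again, which is how the code is used in the proofs below.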

Example 2 (Substrings of incompressible strings) Is a substring of an
incompressible string also incompressible? A string $x = uvw$ can be specified
by a short description for $v$ of length $C(v)$, a description of $l(u)$, and
the literal description of $uw$. Moreover, we need information to tell these
three items apart. Such information can be provided by prefixing each item with
a self-delimiting description of its length. Together this takes
$C(v) + l(uw) + O(\log l(x))$ bits. Hence,

$$C(x) \le C(v) + O(\log l(x)) + l(uw).$$

Thus, if we choose $x$ incompressible, $C(x) \ge l(x)$, then we obtain

$$C(v) \ge l(v) - O(\log l(x)).$$

It can be shown that this is optimal: a substring of an incompressible string
of length $n$ can be compressible by an $O(\log n)$ additional term. This
conforms to a fact we know from probability theory: every random string of
length $n$ is expected to contain a run of about $\log n$ consecutive zeros (or
ones). Such a substring has complexity $O(\log \log n)$. ◇

3 Regular Sets and Finite Automata 

Definition 1 Let $\Sigma$ be a finite nonempty alphabet, and let $Q$ be a
(possibly infinite) nonempty set of states. A transition function is a function
$\delta : \Sigma \times Q \to Q$. We extend $\delta$ to $\delta'$ on $\Sigma^*$
by $\delta'(\epsilon, q) = q$ and

$$\delta'(a_1 \ldots a_n, q) = \delta(a_n, \delta'(a_1 \ldots a_{n-1}, q)).$$

Clearly, if $\delta'$ is not 1-1, then the automaton 'forgets' because some $x$
and $y$ from $\Sigma^*$ drive $\delta'$ into the same memory state. An automaton
$A$ is a quintuple $(\Sigma, Q, \delta, q_0, q_f)$ where everything is as above
and $q_0, q_f \in Q$ are the distinguished initial state and final state,
respectively. We call $A$ a finite automaton (fa) if $Q$ is finite.

We denote 'indistinguishability' of a pair of histories $x, y \in \Sigma^*$ by
$x \sim y$, defined as $\delta'(x, q_0) = \delta'(y, q_0)$.
'Indistinguishability' of strings is reflexive, symmetric, transitive, and
right-invariant ($\delta'(xz, q_0) = \delta'(yz, q_0)$ for all $z$). Thus,
'indistinguishability' is a right-invariant equivalence relation on $\Sigma^*$.
It is a simple matter to ascertain this formally.
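As an illustration of Definition 1, the sketch below implements the extended transition function and the indistinguishability test. The two-state parity automaton `parity_delta` is an example of our own, not one taken from the paper; note that the paper's $\delta$ takes the symbol first and the state second, and the code follows that order.

```python
def delta_ext(delta, word: str, q):
    """Extended transition function: apply delta symbol by symbol, left to right."""
    for a in word:
        q = delta(a, q)
    return q


def indistinguishable(delta, q0, x: str, y: str) -> bool:
    """x ~ y iff both histories drive the automaton into the same state."""
    return delta_ext(delta, x, q0) == delta_ext(delta, y, q0)


def parity_delta(a: str, q: int) -> int:
    """Example automaton (our assumption): state = parity of the number of 1's read."""
    return q ^ 1 if a == "1" else q


if __name__ == "__main__":
    print(indistinguishable(parity_delta, 0, "11", ""))   # same parity
    print(indistinguishable(parity_delta, 0, "1", ""))    # different parity
    # Right-invariance: appending the same z preserves indistinguishability.
    print(all(indistinguishable(parity_delta, 0, "11" + z, z)
              for z in ["", "0", "1", "10", "111"]))
```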
Definition 2 The language accepted by automaton $A$ as above is the set
$L = \{x : \delta'(x, q_0) = q_f\}$. A regular language is a language accepted
by a finite automaton.

It is a straightforward exercise to verify from the definitions the following 
fact (which will be used later). 

Theorem 1 (Myhill, Nerode) The following statements about
$L \subseteq \Sigma^*$ are equivalent.

(i) $L \subseteq \Sigma^*$ is accepted by some finite automaton.

(ii) $L$ is the union of equivalence classes of a right-invariant equivalence
relation of finite index on $\Sigma^*$.

(iii) For all $x, y \in \Sigma^*$ define right-invariant equivalence $x \sim y$
by: for all $z \in \Sigma^*$ we have $xz \in L$ iff $yz \in L$. Then the number
of $\sim$-equivalence classes is finite.

Subsequently, closure of finite automaton languages under complement, union, 
and intersection follow by simple construction of the appropriate $\delta$ functions
from given ones. Details can be found in any textbook on the subject like 
The clumsy pumping lemma approach can now be replaced by the Kolmogorov 
formulation below. 

3.1 Kolmogorov Complexity Replacement for the Pumping Lemma

An important part of formal language theory is deriving a hierarchy of language 
families. The main division is the Chomsky hierarchy, with regular languages, 
context-free languages, context-sensitive languages and recursively enumerable 
languages. The common way to prove that certain languages are not regular 
is by using 'pumping' lemmas, for instance, the uvw-lemma. However, these 
lemmas are quite difficult to state and cumbersome to prove or use. In contrast, 
below we show how to replace such arguments by simple, intuitive and yet 
rigorous, Kolmogorov complexity arguments. 

Regular languages coincide with the languages accepted by finite automata.
This invites a straightforward application of Kolmogorov complexity. Let us give
an example. We prove that $\{0^k 1^k : k \ge 1\}$ is not regular. If it were,
then the state $q$ of a particular accepting fa $A$ after processing $0^k$,
together with the fa, is, up to a constant, a description of $k$. Namely, by
running $A$, initialized in state $q$, on input consisting of only 1's, the
first time $A$ enters an accepting state is after precisely $k$ consecutive 1's.
The size of the description of $A$ and $q$ is bounded by a constant, say $c$,
which is independent of $k$. Altogether, it follows that $C(k) \le c + O(1)$.
But choosing $k$ with $C(k) \ge \log k$ we obtain a contradiction for all large
enough $k$. Hence, since the fa has a fixed finite number of states,
there is a fixed finite number that bounds the Kolmogorov complexity of each
natural number: contradiction. We generalize this observation as follows.
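The decoding procedure used in this argument can be sketched in code. Since no finite automaton accepts $\{0^k 1^k : k \ge 1\}$, the sketch below uses an infinite-state automaton of our own design whose state is a (phase, counter) pair; it only illustrates how $k$ is read off from the state reached after $0^k$ by feeding 1's until acceptance. For a finite automaton that state would cost $O(1)$ bits, which is the source of the contradiction.

```python
def step(state, symbol: str):
    """Transition of an infinite-state automaton for {0^k 1^k : k >= 1} (illustrative)."""
    phase, count = state
    if phase == "zeros" and symbol == "0":
        return ("zeros", count + 1)
    if phase in ("zeros", "ones") and symbol == "1" and count >= 1:
        return ("ones", count - 1)
    return ("dead", 0)


def accepting(state) -> bool:
    return state == ("ones", 0)


def state_after_zeros(k: int):
    state = ("zeros", 0)
    for _ in range(k):
        state = step(state, "0")
    return state


def recover_k(state) -> int:
    """Feed 1's from the given state; return how many are read before first acceptance."""
    ones = 0
    while not accepting(state):
        if state[0] == "dead":
            raise ValueError("state is not reachable by 0^k with k >= 1")
        state = step(state, "1")
        ones += 1
    return ones


if __name__ == "__main__":
    print([recover_k(state_after_zeros(k)) for k in range(1, 8)])
```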






Definition 3 Let $\Sigma$ be a finite nonempty alphabet, and let
$\phi : \mathcal{N} \to \Sigma^*$ be a total recursive function. Then $\phi$
enumerates (possibly a proper subset of) $\Sigma^*$ in order
$\phi(1), \phi(2), \ldots$ We call such an order effective, and $\phi$ an
enumerator.

The lexicographical order is the effective order such that all words in
$\Sigma^*$ are ordered first according to length, and then lexicographically
within the group of each length. Another example is $\phi$ with
$\phi(i) = p_i$, the standard binary representation of the $i$th prime, which
gives an effective order in $\{0,1\}^*$. In this case $\phi$ does not enumerate
all of $\Sigma^*$. Let $L \subseteq \Sigma^*$. Define
$L_x = \{y : xy \in L\}$.

Lemma 1 (KC-Regularity) Let $L \subseteq \Sigma^*$ be regular, and let $\phi$ be
an enumerator in $\Sigma^*$. Then there exists a constant $c$ depending only on
$L$ and $\phi$, such that for each $x$, if $y$ is the $n$th string enumerated in
(or in the complement of) $L_x$, then $C(y) \le C(n) + c$.

Proof. Let $L$ be a regular language. The $n$th string $y$ such that
$xy \in L$ for some $x$ can be described by

• this discussion, and a description of the fa that accepts $L$;

• a description of $\phi$; and

• the state of the fa after processing $x$, and the number $n$.

The statement "(or in the complement of)" follows, since regular languages are
closed under complementation. □
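The proof says that $y$ is determined by the automaton, the enumerator, the state reached after $x$, and $n$. A minimal sketch of that reconstruction follows, assuming the lexicographical (length-first) enumerator and, as an example of our own, the two-state automaton for "odd number of 1's". The names `length_lex`, `run`, and `nth_in_Lx` are ours.

```python
from itertools import count, product


def length_lex(alphabet: str):
    """Enumerate all words over the alphabet by length, then lexicographically."""
    for n in count(0):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def run(delta, word: str, q):
    for a in word:
        q = delta(a, q)
    return q


def nth_in_Lx(delta, q_f, state, n: int, alphabet: str = "01") -> str:
    """n-th word y (n >= 1) that leads from `state` to the final state q_f.

    `state` plays the role of the state after processing x, so x itself is
    never needed. The loop does not terminate if L_x has fewer than n words.
    """
    seen = 0
    for y in length_lex(alphabet):
        if run(delta, y, state) == q_f:
            seen += 1
            if seen == n:
                return y


def parity_delta(a: str, q: int) -> int:
    """Example fa (our assumption): q_0 = 0, q_f = 1 accepts an odd number of 1's."""
    return q ^ 1 if a == "1" else q


if __name__ == "__main__":
    state_after_x = run(parity_delta, "1", 0)
    print([nth_in_Lx(parity_delta, 1, state_after_x, n) for n in range(1, 5)])
```

The inputs to `nth_in_Lx` other than $n$ have size independent of $x$ and $n$, which is the content of the bound $C(y) \le C(n) + c$.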

As an application of the KC-Regularity Lemma we prove that
$\{1^p : p \text{ is prime}\}$ is not regular. Consider the string $xy = 1^p$
with $p$ the $(k+1)$th prime. Set $x = 1^{p'}$, with $p'$ the $k$th prime. Then
$y = 1^{p - p'}$, and $y$ is the lexicographical first element in $L_x$. Hence,
by Lemma 1, $C(p - p') = O(1)$. But the difference between two consecutive
primes grows unbounded. Since there are only $O(1)$ descriptions of length
$O(1)$, we have a contradiction. We give some more examples from the well-known
textbook of Hopcroft and Ullman that are marked * as difficult there:

Example 3 (Exercise 3.1(h)* in Hopcroft and Ullman) Show
$L = \{x x^R w : x, w \in \{0,1\}^* - \{\epsilon\}\}$ is not regular. Set
$x = (01)^m$, where $C(m) \ge \log m$. Then, the lexicographically first word in
$L_x$ is $y$ with $y = (10)^m 0$. But, $C(y) = \Omega(\log m)$, contradicting the
KC-Regularity Lemma. ◇

Example 4 Prove that $L = \{0^i 1^j : i \ne j\}$ is not regular. Set
$x = 0^m$, where $C(m) \ge \log m$. Then, the lexicographically first word of
$\{1\}^*$ that is not in $L_x$ is $y = 1^m$. But, $C(y) = \Omega(\log m)$,
contradicting the KC-Regularity Lemma. ◇
Example 5 (Exercise 3.6* in Hopcroft and Ullman) Prove that $L = \{0^i 1^j : \gcd(i, j) = 1\}$
is not regular. Set $x = 0^{(p-1)!} 1$, where $p > 3$ is a prime, $l(p) = n$ and
$C(p) \ge \log n - \log \log n$. Then the lexicographically first word in $L_x$
is $1^{p-1}$, contradicting the KC-Regularity Lemma. ◇



Example 6 (Section 2.2, Exercises 11-15, Q) Prove that {p : p is the 
standard binary representation of a prime$\}$ is not regular. Suppose the
contrary, and let $p_i$ denote the $i$th prime, $i \ge 1$. Consider the least
binary $p_m = uv$ ($= u 2^{l(v)} + v$), with $u = \prod_{i<k} p_i$ and $v$ not
in $\{0\}^*\{1\}$. Such a prime $p_m$ exists
since each interval [n, n + n^^^^"] of the natural numbers contains a prime, |^. 

Considering $p_m$ now as an integer, $p_m = 2^{l(v)} \prod_{i<k} p_i + v$. Since
integer $v > 1$ and $v$ is not divided by any prime less than $p_k$ (because
$p_m$ is prime), the binary length $l(v) \ge l(p_k)$. Because $p_k$ goes to
infinity with $k$, the value $C(v) \ge C(l(v))$ also goes to infinity with $k$.
But since $v$ is the lexicographical first suffix, with integer $v > 1$, such
that $uv \in L$, we have $C(v) = O(1)$ by the KC-Regularity Lemma, which is a
contradiction. ◇



3.2 Kolmogorov Complexity Characterization of Regular 
Languages 

While the pumping lemmas are not precise enough (except for the difficult con- 
struction in jij) to characterize the regular languages, with Kolmogorov com- 
plexity this is easy. In fact, the KC-Regularity Lemma is a direct corollary of 
the characterization below. The theorem is not only a device to show that some 
nonregular languages are nonregular, as are the common pumping lemmas, but 
it is a characterization of the regular sets. Consequently, it determines whether 
or not a given language is regular, just like the Myhill-Nerode Theorem. The 
usual characterizations of regular languages seem to be practically useful to 
show regularity. The need for pumping lemmas stems from the fact that char- 
acterizations tend to be very hard to use to show nonregularity. In contrast, 
the KC-characterization is practicable for both purposes, as evidenced by the 
examples. 

Definition 4 Let $\Sigma$ be a nonempty finite alphabet, and let $y_i$ be the
$i$th element of $\Sigma^*$ in lexicographic order, $i \ge 1$. For
$L \subseteq \Sigma^*$ and $x \in \Sigma^*$, let $\chi = \chi_1 \chi_2 \ldots$ be
the characteristic sequence of $L_x = \{y : xy \in L\}$, defined by
$\chi_i = 1$ if $x y_i \in L$, and $\chi_i = 0$ otherwise. We denote
$\chi_1 \ldots \chi_n$ by $\chi_{1:n}$.
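Definition 4 can be explored experimentally for small $n$. The sketch below (all names are ours) computes $\chi_{1:n}$ from a membership predicate and counts how many distinct prefixes arise for several $x$. For a regular language such as "odd number of 1's" only two sequences occur, in line with the Myhill-Nerode theorem, while for $\{0^k 1^k : k \ge 1\}$ each $x = 0^j$ gives a new one.

```python
from itertools import count, islice, product


def length_lex(alphabet: str):
    """y_1, y_2, ...: all words by length, then lexicographically (y_1 is empty)."""
    for n in count(0):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def chi_prefix(in_L, x: str, n: int, alphabet: str = "01") -> str:
    """First n bits of the characteristic sequence of L_x."""
    return "".join("1" if in_L(x + y) else "0"
                   for y in islice(length_lex(alphabet), n))


def odd_ones(s: str) -> bool:
    return s.count("1") % 2 == 1


def zeros_then_ones(s: str) -> bool:
    """Membership in {0^k 1^k : k >= 1}."""
    k = len(s) // 2
    return k >= 1 and s == "0" * k + "1" * k


if __name__ == "__main__":
    print(chi_prefix(odd_ones, "", 7))
    print(chi_prefix(odd_ones, "1", 7))
    xs = ["", "0", "00", "000"]
    print(len({chi_prefix(odd_ones, x, 127) for x in xs}))
    print(len({chi_prefix(zeros_then_ones, x, 127) for x in xs}))
```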



Theorem 2 (Regular KC-Characterization) Let $L \subseteq \Sigma^*$, and assume
the notation above. The following statements are equivalent.

(i) $L$ is regular.

(ii) There is a constant $c_L$ depending only on $L$, such that for all
$x \in \Sigma^*$, for all $n$, $C(\chi_{1:n}|n) \le c_L$.

(iii) There is a constant $c_L$ depending only on $L$, such that for all
$x \in \Sigma^*$, for all $n$, $C(\chi_{1:n}) \le C(n) + c_L$.

(iv) There is a constant $c_L$ depending only on $L$, such that for all
$x \in \Sigma^*$, for all $n$, $C(\chi_{1:n}) \le \log n + c_L$.

Proof. (i) $\to$ (ii): by a proof similar to that of the KC-Regularity Lemma.

(ii) $\to$ (iii): obvious.

(iii) $\to$ (iv): obvious.

(iv) $\to$ (i):

Claim 1 For each constant $c$ there are only finitely many one-way infinite
binary strings $\omega$ such that, for all $n$,
$C(\omega_{1:n}) \le \log n + c$.

Proof. The claim is a weaker version of Theorem 6 in 1^. It turns out that 
the weaker version admits a simpler proof. To make the treatment self-contained 
we present this new proof in the Appendix. □ 

By (iv) and the claim, there are only finitely many distinct $\chi$'s associated
with the $x$'s in $\Sigma^*$. Define the right-invariant equivalence relation
$\sim$ by $x \sim x'$ if $\chi = \chi'$. This relation induces a partition of
$\Sigma^*$ in equivalence classes $[x] = \{y : y \sim x\}$. Since there is a
one-one correspondence between the $[x]$'s and the $\chi$'s, and there are only
finitely many distinct $\chi$'s, there are also only finitely many $[x]$'s,
which implies that $L$ is regular by the Myhill-Nerode theorem. □

Remark 1 The KC-Regularity Lemma may be viewed as a corollary of the
Theorem. If $L$ is regular, then clearly $L_x$ is regular, and it follows
immediately that there are only finitely many associated $\chi$'s, and each can
be specified in at most $c$ bits, where $c$ is a constant depending only on $L$
(and enumerator $\phi$). If $y$ is, say, the $n$th string in $L_x$, then we can
specify $y$ as the string corresponding to the $n$th '1' in $\chi$, using only
$C(n) + O(1)$ bits to specify $y$. Hence $C(y) \le C(n) + O(1)$. Without loss of
generality, we need to assume that the $n$th string enumerated in $L_x$ in the
KC-Regularity Lemma is the string corresponding to the $n$th '1' in $\chi$ by
the enumeration in the Theorem, or that there is a recursive mapping between the
two.

Remark 2 If $L$ is nonregular, then there are infinitely many
$x \in \Sigma^*$ with distinct equivalence classes $[x]$, each of which has its
own distinct associated characteristic sequence $\chi$. It is easy to see, for
each automaton (finite or infinite), for each $\chi$ associated with an
equivalence class $[x]$ we have

$$C(\chi_{1:n}|n) \to \inf\{C(y) : y \in [x]\} + O(1),$$

for $n \to \infty$. The difference between finite and infinite automata is
precisely expressed in the fact that only in the first case does there exist an
a priori constant which bounds the lefthand term for all $\chi$.
We show how to prove positive results with the KC-Characterization Theorem.
(Examples of negative results were given in the preceding section.)

Example 7 Prove that $L = \Sigma^*$ is regular. There exists a constant $c$,
such that for each $x$ the associated characteristic sequence is
$\chi = 1, 1, \ldots$, with $C(\chi_{1:n}|n) \le c$. Therefore, $L$ is regular
by the KC-Characterization Theorem. ◇

Example 8 Prove that $L = \{x : \text{the number of '1's in } x \text{ is odd}\}$
is regular. Obviously, there exists a constant $c$ such that for each $x$ we
have $C(\chi_{1:n}) \le C(n) + c$. Therefore, $L$ is regular by the
KC-Characterization Theorem. ◇

4 Deterministic Context-free Languages 

We present a Kolmogorov complexity based criterion to show that certain lan- 
guages are not dcfl. In particular, it can be used to demonstrate the existence of 
witness languages in the difference of the family of context-free languages (cfls) 
and deterministic context-free languages (dcfls). Languages in this difference 
are the most difficult to identify; other non-dcfl are also non-cfl and in those 
cases we can often use the pumping lemma for context-free languages. The new
method compares favorably with other known related techniques (mentioned in 
the Introduction) by being simpler, easier to apply, and apparently more pow- 
erful (because it works on a superset of examples). Yet, our primary goal is to 
demonstrate the usefulness of Kolmogorov complexity in this matter. 

A language is a dcfl iff it is accepted by a deterministic pushdown automaton 
(dpda). 

Intuitively, the lemma below tries to capture the following. Suppose a dpda 
accepts $L = \{0^n 1^n 2^n : n \ge 1\}$. Then the dpda needs to first store a
representation of the all-0 part, and then retrieve it to check against the
all-1 part. But after that check, it seems inevitable that it has discarded the
relevant information about $n$, and cannot use this information again to check
against the all-2 part. That is, the complexity of the all-2 part should be
$C(n) = O(1)$, which yields a contradiction for large $n$.

Definition 5 A one-way infinite string $\omega = \omega_1 \omega_2 \ldots$ over
$\Sigma$ is recursive if there is a total recursive function
$f : \mathcal{N} \to \Sigma$ such that $\omega_i = f(i)$ for all $i \ge 1$.

Lemma 2 (KC-DCFL) Let $L \subseteq \Sigma^*$ be recognized by a deterministic
pushdown machine $M$ and let $c$ be a constant. Let
$\omega = \omega_1 \omega_2 \ldots$ be a recursive sequence over $\Sigma$ which
can be described in $c$ bits. Let $x, y \in \Sigma^*$ with $C(x,y) \le c$ and
let $\zeta = \ldots \zeta_2 \zeta_1$ be a (reversed) recursive sequence over
$\Sigma$ of the form $\ldots yyx$. Let $n, m \in \mathcal{N}$ and
$w \in \Sigma^*$ be such that Items (i) to (iii) below are satisfied.

(i) For each $i$ ($1 \le i \le n$), given $M$'s state and pushdown store
contents after processing input
$\zeta_m \ldots \zeta_1 \omega_1 \ldots \omega_i$, a description of $\omega$,
and an additional description of at most $c$ bits, we can reconstruct $n$ by
running $M$ and observing only acceptance or rejection.

(ii) Given $M$'s state and pushdown store contents after processing input
$\zeta_m \ldots \zeta_1 \omega_1 \ldots \omega_n$, we can reconstruct $w$ from
an additional description of at most $c$ bits.

(iii) $C(\omega_1 \ldots \omega_n) \ge 2 \log \log m$.

Then there is a constant $c'$ depending only on $L$ and $c$ such that
$C(w) \le c'$.

Proof. Let $L$ be accepted by $M$ with input head $h_r$. Assume $m, n, w$
satisfy the conditions in the statement of the lemma. For convenience we write

$$u = \zeta_m \ldots \zeta_1, \qquad v = \omega_1 \ldots \omega_n.$$

For each input $z \in \Sigma^*$, we denote with $c(z)$ the pushdown store
contents at the time $h_r$ has read all of $z$, and moves to the right adjacent
input symbol. Consider the computation of $M$ on input $uv$ from the time when
$h_r$ reaches the end of $u$. There are two cases:

Case 1. There is a constant $c_1$ such that for infinitely many pairs $m, n$
satisfying the statement of the lemma, if $h_r$ continues and reaches the end of
$v$, then all of the original $c(u)$ has been popped except at most the bottom
$c_1$ bits.

That is, machine $M$ decreases its pushdown store from size $l(c(u))$ to size
$c_1$ during the processing of $v$. The first time this occurs, let $v'$ be the
processed initial segment of $v$, and $v''$ the unprocessed suffix (so that
$v = v'v''$) and let $M$ be in state $q$. We can describe $w$ by the following
items.^2

• A self-delimiting description of $M$ (including $\Sigma$) and this discussion
in $O(1)$ bits.

• A self-delimiting description of $\omega$ in $(1 + \epsilon)c$ bits.

• A description of $c(uv')$ and $q$ in $c_1 \log |\Sigma| + O(1)$ bits.

• The 'additional description' mentioned in Item (i) of the statement of the
lemma in self-delimiting format, using at most $(1 + \epsilon)c$ bits. Denote it
by $p$.

• The 'additional description' mentioned in Item (ii) of the statement of the
lemma in self-delimiting format, using at most $(1 + \epsilon)c$ bits. Denote it
by $r$.

2 Since we need to glue different binary items in the encoding together, in a
way so that we can effectively separate them again, like
$\langle x,y \rangle = x'y$, we count $C(x) + 2\log C(x) + 1$ bits for a
self-delimited encoding $x' = 1^{l(l(x))} 0\, l(x)\, x$ of $x$. We only need to
give self-delimiting forms for all but one constituent description item.
By Item (i) in the statement of the lemma we can reconstruct $v''$ from $M$ in
state $q$ and with pushdown store contents $c(uv')$, and $\omega$, using
description $p$. Subsequently, starting $M$ in state $q$ with pushdown store
contents $c(uv')$, we process $v''$. At the end of the computation we have
obtained $M$'s state and pushdown store contents after processing $uv$.
According to Item (ii) in the statement of the lemma, together with description
$r$ we can now reconstruct $w$. Since $C(w)$ is at most the length of this
description,

$$C(w) \le 4c + c_1 \log |\Sigma| + O(1).$$

Setting $c' := 4c + c_1 \log |\Sigma| + O(1)$ satisfies the lemma.

Case 2. By way of contradiction, assume that Case 1 does not hold. That
is, for each constant $c_1$ all but finitely many pairs $m, n$ satisfying the
conditions in the lemma cause $M$ not to decrease its stack height below $c_1$
during the processing of the $v$ part of input $uv$.

Fix some constant $c_1$. Set $m, n$ so that they satisfy the statement of the
lemma, and large enough to validate the argument below. Choose $u'$ as a suffix
of $yy \ldots yx$ with $l(u') > 2^m$ and

$$C(l(u')) < \log \log m. \qquad (2)$$

That is, $l(u')$ is much larger than $l(u)$ ($= m$) and much more regular. A
moment's reflection shows that we can always choose such a $u'$.

Claim 2 For large enough $m$ there exists a $u'$ as above, such that $M$ starts
in the same state and accesses the same top $l(c(u)) - c_1$ elements of its
stack during the processing of the $v$ parts of both inputs $uv$ and $u'v$.

Proof. By assumption, $M$ does not read below the bottom $c_1$ symbols of
$c(u)$ while processing the $v$ part of input $uv$.

We argue that one can choose $u'$ such that the top segment of $c(u')$ is
precisely the same as the top segment of $c(u)$ above the bottom $c_1$ symbols,
for large enough $l(u)$, $l(u')$.

To see this we examine the initial computation of $M$ on $u$. Since $M$ is
deterministic, it must either cycle through a sequence of pushdown store
contents, or increase its pushdown store with repetitions on long enough $u$
(and $u'$). Namely, let a triple $(q, i, s)$ mean that $M$ is in state $q$, has
top pushdown store symbol $s$, and $h_r$ is at the $i$th bit of some $y$.
Consider only the triples $(q, i, s)$ at the steps where $M$ will never go below
the current top pushdown store level again while reading $u$. (That is, $s$ will
not be popped before going into $v$.) There are precisely $l(c(u))$ such
triples. Because the input is repetitious and $M$ is deterministic, some triple
must start to repeat within a constant number of steps and with a constant
interval (in height of $M$'s pushdown store) after $M$ starts reading $y$'s. It
is easy to show that within a repeating interval only a constant number of
$y$'s are read.
The pushdown store does not cycle through an a priori bounded set of pushdown
store contents, since this would mean that there is a constant $c_1$ such that
the processing by $M$ of any suffix of $yy \ldots yx$ does not increase the
stack height above $c_1$. This situation reduces to Case 1 with
$v = \epsilon$.

Therefore, the pushdown store contents grow repetitiously and unboundedly.
Since the repeating cycle starts in the pushdown store after a constant
number of symbols, and its size is constant in number of $y$'s, we can adjust
$u'$ so that $M$ starts in the same state and reads the same top segments of
$c(u)$ and $c(u')$ in the $v$ parts of its computations on $uv$ and $u'v$. This
proves the claim.

□ 

The following items form a description from which we can reconstruct $v$.

• This discussion and a description of $M$ in $O(1)$ bits.

• A self-delimiting description of the recursive sequence $\omega$ of which $v$
is an initial segment in $(1 + \epsilon)c$ bits.

• A self-delimiting description of the pair $(x, y)$ in $(1 + \epsilon)c$ bits.

• A self-delimiting description of $l(u')$ in $(1 + \epsilon)C(l(u'))$ bits.

• A program $p$ to reconstruct $v$ given $\omega$ and $M$'s state and pushdown
store contents after processing $u$. By Item (i) of the statement of the lemma,
$l(p) \le c$. Therefore, a self-delimiting description of $p$ takes at most
$(1 + \epsilon)c$ bits.

The following procedure reconstructs $v$ from this information. Using the
description of $M$ and $u'$ we construct the state $q_{u'}$ and pushdown store
contents $c(u')$ of $M$ after processing $u'$. By Claim 2, the state $q_u$ of
$M$ after processing $u$ satisfies $q_u = q_{u'}$ and the top $l(c(u)) - c_1$
elements of $c(u)$ and $c(u')$ are the same. Run $M$ on input $\omega$ starting
in state $q_{u'}$ and with stack contents $c(u')$. By assumption, no more than
$l(c(u)) - c_1$ elements of $c(u')$ get popped before we have processed
$\omega_1 \ldots \omega_n$. By just looking at the consecutive states of $M$ in
this computation, and using program $p$, we can find $n$ according to Item (i)
in the statement of the lemma. To reconstruct $v$ requires by definition at
least $C(v)$ bits. Therefore,

$$C(v) \le (1 + \epsilon) C(l(u')) + 4c + O(1)$$

$$< (1 + \epsilon) \log \log m + 4c + O(1),$$

where the last inequality follows by Equation (2). But this contradicts Item
(iii) in the statement of the lemma for large enough $m$. □

Items (i) through (iii) in the KC-DCFL Lemma can be considerably weak- 
ened, but the presented version gives the essential idea and power: it suffices 
for many examples. A more restricted, but easier, version is the following. 

Corollary 1 Let $L \subseteq \Sigma^*$ be a dcfl and let $c$ be a constant.
Let $x$ and $y$ be fixed finite words over $\Sigma$ and let $\omega$ be a
recursive sequence over $\Sigma$. Let $u$ be a suffix of $yy \ldots yx$, let
$v$ be a prefix of $\omega$, and let $w \in \Sigma^*$ such that:

(i) $v$ can be described in $c$ bits given $L_u$ in lexicographical order;

(ii) $w$ can be described in $c$ bits given $L_{uv}$ in lexicographical order;
and

(iii) $C(v) \ge 2 \log\log l(u)$.

Then there is a constant $c'$ depending only on $L, c, x, y, \omega$ such that
$C(w) \le c'$.

All the following context-free languages were proved to be not dcfl only with 
great effort before, 0, ||, |2^. Our new proofs are more direct and intuitive. 
Basically, if $v$ is the first word in $L_u$, then processing the $v$ part of
input $uv$ must have already used up the information of $u$. But if there is
not much information left on the pushdown store, then the first word $w$ in
$L_{uv}$ cannot have high Kolmogorov complexity.

Example 9 (Exercise 10.5 (a)** in 0) Prove $L = \{x : x = x^R, x \in \{0,1\}^*\}$
is not dcfl. Suppose the contrary. Set $u = 0^n 1$ and $v = 0^n$, with
$C(n) \ge \log n$, satisfying Item (iii) of the lemma. Since $v$ is
lexicographically the first word in $L_u$, Item (i) of the lemma is satisfied.
The lexicographically first nonempty word in $L_{uv}$ is $10^n$, and so we can
set $w = 10^n$, satisfying Item (ii) of the lemma. But now we have
$C(w) = \Omega(\log n)$, contradicting the KC-DCFL Lemma and its Corollary.

Approximately the same proof shows that the context-free language
$\{xx^R : x \in \Sigma^*\}$ and the context-sensitive language
$\{xx : x \in \Sigma^*\}$ are not deterministic context-free languages. ◇
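
The two "lexicographically first word" claims in Example 9 can be checked by brute force for small $n$. The following Python sketch is only an illustration: the helper names `words` and `first_word` are hypothetical, and it assumes that "lexicographical order" means shorter words first, then dictionary order.

```python
from itertools import product

def words(alphabet="01", max_len=12, nonempty=False):
    """Yield words by increasing length, then in dictionary order (assumed ordering)."""
    for length in range(1 if nonempty else 0, max_len + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

def first_word(prefix, in_language, nonempty=False, max_len=12):
    """Return the first w in the ordering above with prefix + w in the language."""
    for w in words(max_len=max_len, nonempty=nonempty):
        if in_language(prefix + w):
            return w
    return None  # nothing found up to max_len

def is_palindrome(s):
    return s == s[::-1]

n = 4
u = "0" * n + "1"
v = first_word(u, is_palindrome)                     # first word in L_u
w = first_word(u + v, is_palindrome, nonempty=True)  # first nonempty word in L_uv
print(v, w)
```

Both $v$ and $w$ determine $n$, which is why a $w$ obtained from such a simple search still has complexity $\Omega(\log n)$ when $C(n) \ge \log n$.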

Example 10 (Exercise 10.5 (b)** in 0, Example 1 in Q) Prove
$\{0^n 1^m : m = n, 2n\}$ is not dcfl. Suppose the contrary. Let $u = 0^n$ and
$v = 1^n$, where $C(n) \ge \log n$. Then $v$ is the lexicographically first
word in $L_u$. The lexicographically first nonempty word in $L_{uv}$ is $1^n$.
Set $w = 1^n$, and $C(w) = \Omega(\log n)$, contradicting the KC-DCFL Lemma and
its Corollary. ◇

Example 11 (Example 2 in ||2^]) Prove $L = \{xy : l(x) = l(y)$, $y$ contains a
'1', $x, y \in \{0,1\}^*\}$ is not dcfl. Suppose the contrary. Set
$u = 0^n 1$ where $l(u)$ is even. Then $v = 0^{n+1}$ is lexicographically the
first even length word not in $L_u$. With $C(n) \ge \log n$, this satisfies
Items (i) and (iii) of the lemma. Choosing $w = 10^{2n+3}$, the
lexicographically first even length word not in $L_{uv}$ starting with a '1',
satisfies Item (ii). But $C(w) = \Omega(\log n)$, which contradicts the KC-DCFL
Lemma and its Corollary. ◇
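
Example 11 searches the complement of $L_u$ and $L_{uv}$ rather than the sets themselves. A minimal Python sketch of that search for a small odd $n$ follows. The function names are hypothetical, and the ordering (shorter words first, then dictionary order) is an assumption.

```python
from itertools import product

def in_L(s):
    """s = xy with l(x) = l(y) and y containing a '1'."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and "1" in s[half:]

def first_even_word_outside(prefix, start="", max_len=14):
    """First even-length w beginning with `start` such that prefix + w is NOT in L."""
    for length in range(0, max_len + 1, 2):
        for letters in product("01", repeat=length):
            w = "".join(letters)
            if w.startswith(start) and not in_L(prefix + w):
                return w
    return None  # nothing found up to max_len

n = 3                      # n odd, so that l(u) = n + 1 is even
u = "0" * n + "1"
v = first_even_word_outside(u)
w = first_even_word_outside(u + v, start="1")
print(v, w)
```

For this $n$ the search returns $v = 0^{n+1}$ and $w = 10^{2n+3}$, matching the words used in the example.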

Example 12 Prove $L = \{0^i 1^j 2^k : i, j, k \ge 0$, $i = j$ or $j = k\}$ is
not dcfl. Suppose the contrary. Let $u = 0^n$ and $v = 1^n$ where
$C(n) \ge \log n$, satisfying Item (iii) of the lemma. Then, $v$ is
lexicographically the first word in $L_u$, satisfying Item (i). The
lexicographically first word in $L_{uv} \cap \{1\}\{2\}^*$ is $12^{n+1}$.
Therefore, we can set $w = 12^{n+1}$ and satisfy Item (ii). Then
$C(w) = \Omega(\log n)$, contradicting the KC-DCFL Lemma and its Corollary. ◇
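
The choice of $w$ in Example 12 can be confirmed mechanically. The Python sketch below is an illustration with hypothetical function names. It tests membership in $L$ with a regular expression plus a length comparison, and walks through $\{1\}\{2\}^*$ in order of increasing length.

```python
import re

def in_L(s):
    """Membership in {0^i 1^j 2^k : i = j or j = k}."""
    m = re.fullmatch(r"(0*)(1*)(2*)", s)
    if m is None:
        return False
    i, j, k = (len(g) for g in m.groups())
    return i == j or j == k

def first_in_1_2star(prefix, max_k=50):
    """First word of the form 1 2^k (shortest first) with prefix + word in L."""
    for k in range(max_k + 1):
        w = "1" + "2" * k
        if in_L(prefix + w):
            return w
    return None  # nothing found up to max_k

n = 5
u, v = "0" * n, "1" * n
w = first_in_1_2star(u + v)
print(w)
```

After $uv = 0^n 1^n$, one further '1' breaks the equality $i = j$, so membership forces $k = j = n + 1$.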



Example 13 (Pattern-Matching) The KC-DCFL Lemma and its Corollary can be used
trickily. We prove $\{x \# y x^R z : x, y, z \in \{0,1\}^*\}$ is not dcfl.
Suppose the contrary. Let $u = 1^n \#$ and $v = 1^{n-1}0$, where
$C(n) \ge \log n$, satisfying Item (iii) of the lemma. Since $v' = 1^n$ is the
lexicographically first word in $L_u$, the choice of $v$ satisfies Item (i) of
the lemma. (We can reconstruct $v$ from $v'$ by flipping the last bit of $v'$
from 1 to 0.) Then $w = 1^n$ is lexicographically the first word in $L_{uv}$,
to satisfy Item (ii). Since $C(w) = \Omega(\log n)$, this contradicts the
KC-DCFL Lemma and its Corollary. ◇



5 Recursive, Recursively Enumerable, and Beyond

It is immediately obvious how to characterize recursive languages in terms of
Kolmogorov complexity. If $L \subseteq \Sigma^*$, and
$\Sigma^* = \{v_1, v_2, \ldots\}$ is effectively ordered, then we define the
characteristic sequence $\chi = \chi_1, \chi_2, \ldots$ of $L$ by $\chi_i = 1$
if $v_i \in L$ and $\chi_i = 0$ otherwise. In terms of the earlier developed
terminology, if $A$ is the automaton accepting $L$, then $\chi$ is the
characteristic sequence associated with the equivalence class $[\epsilon]$.
Recall the definition of a recursive sequence. A set $L \subseteq \Sigma^*$ is
recursive iff its characteristic sequence $\chi$ is a recursive sequence. It
then follows trivially from the definitions:

Theorem 3 (Recursive KC Characterization) A set $L \subseteq \Sigma^*$ is
recursive iff there exists a constant $c_L$ (depending only on $L$) such that,
for all $n$, $C(\chi_{1:n}|n) \le c_L$.
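
One direction of Theorem 3 is easy to make concrete. If $L$ is decidable, a single fixed program outputs $\chi_{1:n}$ from $n$, so the conditional complexity is bounded by the size of that program. The Python sketch below illustrates this under two assumptions: the effective ordering is by length and then dictionary order, and the palindrome test is only a hypothetical stand-in for a recursive language.

```python
from itertools import count, islice, product

def enumerate_words(alphabet="01"):
    """Effective ordering v_1, v_2, ... of Sigma*: by length, then dictionary order."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

def characteristic_prefix(decide, n, alphabet="01"):
    """Return chi_{1:n} as a 0/1 string for the language decided by `decide`."""
    return "".join("1" if decide(v) else "0"
                   for v in islice(enumerate_words(alphabet), n))

def is_palindrome(s):      # hypothetical example of a recursive language
    return s == s[::-1]

print(characteristic_prefix(is_palindrome, 7))
```

The program text does not depend on $n$, which is what the constant $c_L$ expresses.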

$L$ is r.e. if the set $\{n : \chi_n = 1\}$ is r.e. In terms of Kolmogorov
complexity, the following theorem gives not only a qualitative but even a
quantitative difference between recursive and r.e. languages. The following
theorem is due to Barzdin', §0.

Theorem 4 (KC-R.e.) (i) If $L$ is r.e., then there is a constant $c_L$
(depending only on $L$), such that for all $n$,
$C(\chi_{1:n}|n) \le \log n + c_L$.

(ii) There exists an r.e. set $L$ such that $C(\chi_{1:n}) \ge \log n$, for
all $n$.

Note that, with $L$ as in Item (ii), the set $\Sigma^* - L$ (which is possibly
non-r.e.) also satisfies Item (i). Therefore, Item (i) is not a Kolmogorov
complexity characterization of the r.e. sets.
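
The text does not spell out why Item (i) holds. The usual argument is that, given $n$, it suffices to know how many of $\chi_1, \ldots, \chi_n$ equal 1, a number that needs about $\log n$ bits: one enumerates $L$ until that many indices at most $n$ have appeared. The Python sketch below illustrates this argument. The function name `reconstruct_prefix` and the example set of squares are hypothetical.

```python
def reconstruct_prefix(enumerate_indices, n, ones):
    """Rebuild chi_{1:n} from n and the number `ones` of 1s among chi_1..chi_n.

    enumerate_indices() yields every i with chi_i = 1, in any order.
    If `ones` is too large the loop never ends, mirroring the fact that
    the count is essential information.
    """
    found = set()
    if ones > 0:
        for i in enumerate_indices():
            if 1 <= i <= n:
                found.add(i)
                if len(found) == ones:
                    break
    return "".join("1" if i in found else "0" for i in range(1, n + 1))

def squares():             # hypothetical stand-in for an r.e. set of indices
    k = 1
    while True:
        yield k * k
        k += 1

n, ones = 10, 3            # 1, 4 and 9 are the squares up to 10
print(reconstruct_prefix(squares, n, ones))
```

The same count also describes the complement, which is consistent with the remark that $\Sigma^* - L$ satisfies Item (i) as well.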

Example 14 Consider the standard enumeration of Turing machines. Define
$k = k_1 k_2 \ldots$ by $k_i = 1$ if the $i$th Turing machine started on its
$i$th program halts ($\phi_i(i) < \infty$), and $k_i = 0$ otherwise. Let $A$
be the language such that $k$ is its characteristic sequence. Clearly, $A$ is
an r.e. set. In ||l| it is shown that $C(k_{1:n}) \ge \log n$, for all $n$. ◇



Example 15 Let $k$ be as in the previous example. Define a one-way infinite
binary sequence $h$ by

$h = k_1 0^{2} k_2 0^{2^2} \ldots k_i 0^{2^i} k_{i+1} \ldots$

Then, $C(h_{1:n}) = O(C(n)) + \Theta(\log\log n)$. Therefore, if $h$ is the
characteristic sequence of a set $B$, then $B$ is not recursive, but more
'sparsely' nonrecursive than is $A$. ◇
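
To see where the $\log\log n$ term comes from, it helps to count how many bits of $k$ occur in a prefix of $h$. The Python sketch below assumes that the $i$th padding block consists of $2^i$ zeros. This is a reading of the garbled display, not something the extracted text confirms, and the function names are hypothetical.

```python
def interleave(bits):
    """Build k_1 0^(2^1) k_2 0^(2^2) ... for a finite list of bits (assumed padding)."""
    out = []
    for i, b in enumerate(bits, start=1):
        out.append(str(b))
        out.append("0" * (2 ** i))
    return "".join(out)

def informative_bits(n):
    """Number of bits k_i that appear among the first n symbols of h.

    Under the assumed padding, k_i sits at position 2^i + i - 2 (1-based).
    """
    count, i = 0, 1
    while 2 ** i + i - 2 <= n:
        count += 1
        i += 1
    return count

print(interleave([1, 1, 0]))
print(informative_bits(10), informative_bits(1000))
```

Under this assumption only about $\log n$ bits of $k$ occur in $h_{1:n}$, and a prefix of $k$ of that length has complexity at least $\log\log n$ by Example 14.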



Example 16 The probability that the optimal universal Turing machine $U$
halts on self-delimiting binary input $p$, randomly supplied by tosses of a
fair coin, is $\Omega$, $0 < \Omega < 1$. Let the binary representation of
$\Omega$ be $0.\Omega_1 \Omega_2 \ldots$ Let $\Sigma$ be a finite nonempty
alphabet, and $v_1, v_2, \ldots$ an effective enumeration without repetitions
of $\Sigma^*$. Define $L \subseteq \Sigma^*$ such that $v_i \in L$ iff
$\Omega_i = 1$. It can be shown, see for example [Q, that the sequence
$\Omega_1, \Omega_2, \ldots$ satisfies

$C(\Omega_{1:n}|n) \ge n - \log n - 2 \log\log n - O(1)$,

for all but finitely many $n$.

Hence neither $L$ nor $\Sigma^* - L$ is r.e. It is not difficult to see that
$L \in \Delta_2 - (\Sigma_1 \cup \Pi_1)$, in the arithmetic hierarchy (that
is, $L$ is not recursively enumerable), [HH. ◇



6 Questions for Future Research 

(1) It is not difficult to give a direct KC-analogue of the uvwxy Pumping Lemma
(as Tao Jiang pointed out to us). Just like the Pumping Lemma, this will show
that $\{a^n b^n c^n : n \ge 1\}$, $\{xx : x \in \Sigma^*\}$,
$\{a^p : p$ is prime$\}$, and so on, are not cfl. Clearly, this hasn't yet
captured the Kolmogorov complexity heart of cfl. More generally, can we find a
CFL-KC-Characterization?

(2) What about ambiguous context-free languages? 

(3) What about context-sensitive languages and deterministic context-sensitive 
languages? 

Appendix: Proof of Claim



A recursive real is a real number whose binary expansion is recursive in the
sense of the definition of a recursive sequence. The following result is
demonstrated in and attributed to A.R. Meyer. For each constant $c$ there are
only finitely many $\omega \in \{0,1\}^\infty$ with $C(\omega_{1:n}|n) \le c$
for all $n$. Moreover, each such $\omega$ is a recursive real.

In Ig] this is strengthened to a version with $C(\omega_{1:n}) \le C(n) + c$,
and strengthened again to a version with $C(\omega_{1:n}) \le \log n + c$. The
claim is weaker than the latter version by not requiring the $\omega$'s to be
recursive reals. For completeness' sake, we present a new direct proof of the
claim, avoiding the notion of recursive reals.

Recall our convention of identifying integer $x$ with the $x$th binary sequence
in lexicographical order of $\{0,1\}^*$ as in Equation 0.
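
The identification of integers with binary strings can be sketched in a few lines of Python. The equation it refers to is not part of this section, so the sketch assumes the common convention in which the empty word corresponds to 0, then 0 to 1, 1 to 2, 00 to 3, and so on. The function names are hypothetical.

```python
def int_to_string(x):
    """The x-th binary string, by length then dictionary order, counting from 0."""
    # Writing x + 1 in binary and dropping the leading 1 gives the string.
    return bin(x + 1)[3:]

def string_to_int(s):
    """Inverse of int_to_string."""
    return int("1" + s, 2) - 1

print([int_to_string(x) for x in range(7)])
```

Under this convention $l(x)$ is roughly $\log x$, which is how the proof passes between the length of a string and the logarithm of the integer it denotes.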

Proof. [of the Claim] Let $c$ be a positive constant, and let

$A_n = \{x \in \{0,1\}^n : C(x) \le \log n + c\}$, (3)
$A = \{\omega \in \{0,1\}^\infty : \forall n \in \mathcal{N}\, [C(\omega_{1:n}) \le \log n + c]\}$.

If the cardinality $d(A_n)$ of $A_n$ dips below a fixed constant $c'$, for
infinitely many $n$, then $c'$ is an upper bound on $d(A)$. This is because it
is an upper bound on the cardinality of the set of prefixes of length $n$ of
the elements in $A$, for all $n$.

Fix any $l \in \mathcal{N}$. Choose a binary string $y$ of length $2l + c + 1$
satisfying

$C(y) \ge 2l + c + 1$. (4)

Choose $i$ maximum such that for division of $y$ in $y = mn$ with $l(m) = i$
we have

$m \le d(A_n)$. (5)

(This holds at least for $i = 0 = m$.) Define similarly a division $y = sr$
with $l(s) = i + 1$. By maximality of $i$, we have $s > d(A_r)$. From the
easily proven $s \le 2m + 1$, it then follows that

$d(A_r) \le 2m$. (6)

We prove $l(r) \ge l$. Since by Equations 3 and 5 we have

$m \le d(A_n) \le 2^c n$,

it follows that $l(m) \le l(n) + c$. Therefore,

$2l + c + 1 = l(y) = l(n) + l(m) \le 2l(n) + c$,

which implies that $l(n) > l$. Consequently, $l(r) = l(n) - 1 \ge l$.

We prove $d(A_r) = O(1)$. By dovetailing the computations of the reference
universal Turing machine $U$ for all programs $p$ with $l(p) \le \log n + c$,
we can enumerate all elements of $A_n$. We can reconstruct $y$ from the $m$th
element, say $y_0$, of this enumeration. Namely, from $y_0$ we reconstruct $n$
since $l(y_0) = n$, and we obtain $m$ by enumerating $A_n$ until $y_0$ is
generated. By concatenation we obtain $y = mn$. Therefore,

$C(y) \le C(y_0) + O(1) \le \log n + c + O(1)$. (7)

From Equation 4 we have

$C(y) \ge \log n + \log m$. (8)

Combining Equations 7 and 8, it follows that $\log m \le c + O(1)$. Therefore,
by Equation 6,

$d(A_r) \le 2^{c + O(1)}$.

Here, $c$ is a fixed constant independent of $n$ and $m$. Since $l(r) \ge l$
and we can choose $l$ arbitrarily, $d(A_r) \le c_0$ for a fixed constant $c_0$
and infinitely many $r$, which implies $d(A) \le c_0$, and hence the claim. □

We avoided establishing, as in the cited references, that the elements of $A$
defined in Equation 3 are recursive reals. The resulting proof is simpler, and
sufficient for our purpose, since we only need to establish the finiteness of
$A$.

Remark 3 The difficult part of the Regular KC-Characterization Theorem above
consists in proving that the KC-Regularity Lemma is exhaustive, i.e., can be
used to prove the nonregularity of all nonregular languages. Let us look a
little more closely at the set of sequences defined in Item (iii) of the
KC-Characterization Theorem. The set of sequences $A$ of Equation 3 is a
superset of the set of characteristic sequences associated with $L$. According
to the proof in the cited references, this set $A$ contains finitely many
recursive sequences (computable by Turing machines). The subset of $A$
consisting of the characteristic sequences associated with $L$ satisfies much
more stringent computational requirements, since it can be computed using only
the finite automaton recognizing $L$. If we replace the plain Kolmogorov
complexity in the statement of the theorem by the so-called 'prefix
complexity' variant $K$, then the equivalent set of $A$ in Equation 3 is

$\{\omega \in \{0,1\}^\infty : \forall n \in \mathcal{N}\, [K(\omega_{1:n}) \le K(n) + c]\}$,

which contains nonrecursive sequences by a result of R.M. Solovay, [21].